Independence of Tabulation-Based Hash Classes
نویسندگان
چکیده
A tabulation-based hash function maps a key into d derived characters indexing random values in tables that are then combined with bitwise xor operations to give the hash. Thorup and Zhang [8] presented d-wise independent tabulation-based hash classes that use linear maps over finite fields to map a key, considered as a vector (a, b), to derived characters. We show that a variant where the derived characters are a + b · i for i = 0, . . . , q − 1 (using integer arithmetic) yielding (2d− 1)-wise independence. Our analysis is based on an algebraic property that characterizes k-wise independence of tabulation-based hashing schemes, and combines this characterization with a geometric argument. We also prove a nontrivial lower bound on the number of derived characters necessary for k-wise independence with our and related hash classes.
منابع مشابه
Approximately Minwise Independence with Twisted Tabulation
A random hash function h is ε-minwise if for any set S, |S| “ n, and element x P S, Prrhpxq “ minhpSqs “ p1 ̆ εq{n. Minwise hash functions with low bias ε have widespread applications within similarity estimation. Hashing from a universe rus, the twisted tabulation hashing of Pǎtraşcu and Thorup [SODA’13] makes c “ Op1q lookups in tables of size u1{c. Twisted tabulation was invented to get good ...
متن کاملAn Improved Hash Function Based on the Tillich-Zémor Hash Function
Using the idea behind the Tillich-Zémor hash function, we propose a new hash function. Our hash function is parallelizable and its collision resistance is implied by a hardness assumption on a mathematical problem. Also, it is secure against the known attacks. It is the most secure variant of the Tillich-Zémor hash function until now.
متن کاملAppendix for Tabulation Based 4-Universal Hashing with Applications to Second Moment Estimation
متن کامل
Practical Hash Functions for Similarity Estimation and Dimensionality Reduction
Hashing is a basic tool for dimensionality reduction employed in several aspects of machine learning. However, the perfomance analysis is often carried out under the abstract assumption that a truly random unit cost hash function is used, without concern for which concrete hash function is employed. The concrete hash function may work fine on sufficiently random input. The question is if they c...
متن کاملLecture 10 — March 20 , 2012
In the last lecture, we finished up talking about memory hierarchies and linked cache-oblivious data structures with geometric data structures. In this lecture we talk about different approaches to hashing. First, we talk about different hash functions and their properties, from basic universality to k-wise independence to a simple but effective hash function called simple tabulation. Then, we ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012